perm filename SCCPP.TIM[TIM,LSP] blob
sn#708207 filedate 1983-04-26 generic text, type C, neo UTF8
COMMENT ⊗ VALID 00010 PAGES
C REC PAGE DESCRIPTION
C00001 00001
C00002 00002 SCCPP Multics
C00005 00003 SCCPP Rutgers
C00011 00004 SCCPP SAIL
C00015 00005 SCCPP Franz
C00020 00006 SCCPP Texas
C00023 00007 SCCPP Multics
C00024 00008 UTLISP 5.1
C00026 00009 LM-2
C00028 00010 InterLisp on SCORE
C00030 ENDMK
C⊗;
SCCPP Multics
∂06-Apr-81 1931 Bernard S. Greenberg <Greenberg at MIT-Multics> Re: Timing benchmark
Date: 6 April 1981 2147-est
From: Bernard S. Greenberg <Greenberg at MIT-Multics>
Subject: Re: Timing benchmark
To: RPG at SU-AI
Cc: "@LSPTRN.DIS[P,DOC]" at SU-AI
I ran this program three times on MIT-Multics. Here's what it said:
Runtime (secs):  86.4   86.4   86.8
gctime  (secs):   2.4    3.8    2.5    (41 users out of a max set at 120)
Before we accept that this implementation is really 60 times
slower than MACLISP on SAIL, I would like to point out that
the unmodified program, sent in the mail, also ran on AI-ITS
MACLISP, also compiled, for FIVE SOLID WALL-CLOCK MINUTES (as
it did on Multics) without opening its mouth, but I quit it
before it finished on AI (several times). The KA-10 is reported
to be 2 or 3 times slower (instruction rate) than Multics.
The KL-10 at SAIL, given all benefit of the doubt, is NOT
TWO ORDERS OF MAGNITUDE faster than the AI KA-10.
Has anyone else encountered difficulty in making this program run
in a small number of seconds of CPU time? Perhaps there is some
subtle conversion problem I missed in my perusal of this program?
(I made no modifications other than converting the characters to
lower-case, and on Multics, macroing *catch/*throw to catch/throw).
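For concreteness, the *catch/*throw conversion mentioned above could be done
with a pair of trivial macros. A minimal sketch only, assuming a defmacro
facility is loaded and that the Multics catch/throw take their arguments in
the same order as *catch/*throw (if the conventions differ, the macro bodies
would have to reorder them):

(defmacro *catch (tag form) `(catch ,tag ,form))
(defmacro *throw (tag value) `(throw ,tag ,value))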
Did you compile these functions? If not, please compile them and
send those results too. Thanks.
-rpg-
SCCPP Rutgers
∂06-Apr-81 2008 HEDRICK at RUTGERS Re: Timing benchmark
Date: 6 Apr 1981 2307-EST
From: HEDRICK at RUTGERS
Subject: Re: Timing benchmark
To: RPG at SU-AI
In-Reply-To: Your message of 6-Apr-81 1602-EST
I translated your benchmark to R/UCI Lisp. The only non-obvious
translation was due to the fact that in R/UCI Lisp (as in all Lisp 1.6
derivatives) there can only be 5 arguments to compiled functions.
Fortunately the arguments beyond the 4th were always NIL in your case,
so I just eliminated them. (R/UCI Lisp has a special kind of function,
an LEXPR, that allows more than 5 arguments. However, its use did not
seem necessary in this case.)
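As an illustration of the change (the parameter names here are invented; only
the call form is taken from the benchmark as it appears elsewhere in this
file), a 7-argument definition whose arguments beyond the 4th are always NIL
can be cut down to the four that matter, with the dropped parameters rebound
to NIL around the original body:

;; Before:  (DEFUN PAIRS (X Y SOFAR FN D1 D2 D3) ...)
;;          called as (PAIRS A B () 'EQUAL () () ())
;; After:
(DEFUN PAIRS (X Y SOFAR FN)
  ((LAMBDA (D1 D2 D3)
     (LIST X Y SOFAR FN D1 D2 D3))    ; stand-in for the original body
   NIL NIL NIL))
;; and call sites become (PAIRS A B () 'EQUAL).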
Totals, including GC:
R/UCI Lisp (with free space expanded by 50K):
  interpreted:       15.0
  compiled:           4.6
  compiled, NOUUO:    3.2  (of which .6 is GC)
By NOUUO I mean that linkage between compiled functions is by direct
PUSHJ, not going through the interpreter. This makes tracing and
breaking impossible. (In Elisp we will use a protocol that does not
have this problem.) Note that total runtimes are slightly better than
RPG's MACLisp timings. However, more of it is runtime and less is GC.
I conjecture that this will be typical of Lisp programs whose only
arithmetic involves small integers. MACLisp will produce better
code for small integers, but will have to box and unbox them when returning
from functions or putting them into data structures, causing faster
runtime but more GC time. The first call is no different from the others
because in R/UCI Lisp there is no automatic expansion; we had to
expand free storage by 50K explicitly before running.
Elisp (extended addressing Lisp) does not yet have a compiler.
Therefore some of the system functions (e.g. MEMBER, CADR) are running
interpreted. This slows things down noticeably. To get an idea of how
Elisp is going to work out, I compared it with a copy of R/UCI Lisp in
which the same functions are being interpreted. [The temptation is
great to simply code these things in assembly language, since there are
really only a few at this point. However, I will attempt to resist this
temptation and continue to compare Elisp with this semi-interpreted
R/UCI Lisp.] Elisp uses dynamic expansion and contraction of memory.
However there is no apparent difference between the first time and
other times (possibly because memory has contracted to its initial
state by the end).
Elisp:
  interpreted:  28.5  (E-box time, as used by RPG, 20.1 sec.)
R/UCI Lisp (with the same functions interpreted):
  interpreted:  32.6
So Elisp is (as usual) a small amount faster than R/UCI Lisp. This
suggests that a good prediction for the final version of Elisp is 14
sec. interpreted. I am not ready to make predictions for Elisp compiled
code, as we don't know how the pager refill problem is going to affect
it. My guess is that it will be slightly slower than R/UCI Lisp, with
slowdown due to pager refills (from extended addressing) somewhat offset
by the better code generated by Utah's compiler.
Note that I normally report "normal" CPU time, not E-box times as used
by RPG. The E-box times will be noticeably smaller in the case of Elisp.
I regard the use of E-box time with Elisp as problematical, since it
is significantly smaller than conventional CPU time, even with 0 load
on the system. I think this shows that E-box time omits some sort of
overhead that Elisp generates, probably pager refills (though the
documentation says that refills are not counted). Until someone convinces
me otherwise, I will regard conventional CPU time as a better index of
Elisp's load on the system. I report CPU times with fairly light (1 to 2)
load averages.
-------
SCCPP SAIL
Compiled
(RUNTIME 1.969) ;ebox
(GCTIME 29.914) ;ebox
(WRUNTIME 3.44501406)
(WGCTIME 52.3383193)
(WHOLINE 55.7833333)
2592
(RUNTIME 1.903)
(GCTIME 2.616)
(WRUNTIME 3.34783137)
(WGCTIME 4.6021686)
(WHOLINE 7.95)
2592
(RUNTIME 2.008)
(GCTIME 2.61)
(WRUNTIME 3.56552616)
(WGCTIME 4.6344738)
(WHOLINE 8.2)
2592
(RUNTIME 1.959)
(GCTIME 2.065)
(WRUNTIME 3.86217695)
(WGCTIME 4.0711564)
(WHOLINE 7.93333334)
2592
(RUNTIME 1.904)
(GCTIME 1.921)
(WRUNTIME 3.5259259)
(WGCTIME 3.55740738)
(WHOLINE 7.0833333)
Interpreted
2592
(RUNTIME 17.572)
(GCTIME 31.451)
(WRUNTIME 37.176616)
(WGCTIME 66.54005)
(WHOLINE 103.716666)
2592
(RUNTIME 17.191)
(GCTIME 2.586)
(WRUNTIME 39.1158924)
(WGCTIME 5.8841078)
(WHOLINE 45.0)
2592
(RUNTIME 17.32)
(GCTIME 2.642)
(WRUNTIME 35.8194237)
(WGCTIME 5.4639098)
(WHOLINE 41.2833333)
2592
(RUNTIME 17.414)
(GCTIME 2.103)
(WRUNTIME 40.106538)
(WGCTIME 4.84346205)
(WHOLINE 44.95)
2592
(RUNTIME 17.559)
(GCTIME 1.955)
(WRUNTIME 45.3507023)
(WGCTIME 5.049298)
(WHOLINE 50.4)
Compiled
2592
(RUNTIME 1.948)
(GCTIME 29.711)
2592
(RUNTIME 1.887)
(GCTIME 2.599)
2592
(RUNTIME 1.886)
(GCTIME 2.565)
2592
(RUNTIME 1.892)
(GCTIME 1.922)
2592
(RUNTIME 1.895)
(GCTIME 1.973)
;;; Model B
(fasload sccpp)
(timit)
Timing performed on Tuesday 04/26/83 at 12:21:49.
Cpu (- GC) Time = 1.688
Elapsed Time = 136.766666
Wholine Time = 59.9
GC Time = 35.419
Load Average Before = 0.84687555
Load Average After = 2.21817338
Average Load Average = 1.53252447
NIL
(timit)
Timing performed on Tuesday 04/26/83 at 12:24:14.
Cpu (- GC) Time = 1.644
Elapsed Time = 29.8833334
Wholine Time = 7.15
GC Time = 3.078
Load Average Before = 2.28072834
Load Average After = 2.58572388
Average Load Average = 2.4332261
NIL
(timit)
Timing performed on Tuesday 04/26/83 at 12:24:51.
Cpu (- GC) Time = 1.643
Elapsed Time = 23.1166666
Wholine Time = 8.4166666
GC Time = 2.936
Load Average Before = 2.5922866
Load Average After = 2.6549803
Average Load Average = 2.62363344
NIL
SCCPP Franz
∂11-Apr-81 1001 CSVAX.jkf at Berkeley result of pairs benchmark on franz.
Date: 11 Apr 1981 09:56:26-PST
From: CSVAX.jkf at Berkeley
To: rpg@su-ai
Subject: result of pairs benchmark on franz.
pair benchmark results
submitted by j foderaro
(csvax.jkf@berkeley)
10 april 81
Here are the results of running the pairs benchmark on Franz Lisp on
a VAX 11/780 running Berkeley 4BSD Unix. The load average was less
than one when the timing was done. Timing on Unix is done in 60ths of
a second and the time charged to a process includes some of the system
overhead for memory management. I ran the benchmarks on an unloaded
system to reduce the interference of the memory manager.
The program was run with modifications only to the test function
shown below. Perhaps in the future you should use macros like (cpu-time)
and (gc-time), and let each site define these macros for itself (a sketch
of such definitions follows the test function).
(defun test ()
  ((lambda (t1 x gt)
     (setq x (pairs a b () 'equal () () ()))
     (setq t1 (- #-Franz (runtime)
                 #+Franz (car (ptime))
                 t1))
     (setq gt (- #-Franz (status gctime)
                 #+Franz (cadr (ptime))
                 gt))
     (print (length x))
     (print (list 'runtime
                  (QUOTIENT (FLOAT (- t1 gt))
                            #-Franz 1000000.
                            #+Franz 60.)))
     (print (list 'gctime
                  (quotient (float gt)
                            #-Franz 1000000.
                            #+Franz 60.)))
     #+Franz (terpri))
   #-Franz (runtime) #+Franz (car (ptime))
   ()
   #-Franz (status gctime)
   #+Franz (cadr (ptime))))
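A sketch of the site-definable macros suggested above, using the same
per-dialect expressions that already appear in the test function (times come
back in seconds; each site would load its own pair of definitions):

#+Franz
(progn (defmacro cpu-time () '(quotient (float (car (ptime))) 60.))
       (defmacro gc-time  () '(quotient (float (cadr (ptime))) 60.)))
#-Franz
(progn (defmacro cpu-time () '(quotient (float (runtime)) 1000000.))
       (defmacro gc-time  () '(quotient (float (status gctime)) 1000000.)))

With these, the test function could simply take differences of (cpu-time)
and (gc-time) readings around the call to pairs, with no read-time
conditionals in the benchmark itself.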
---
The size of the compiled file is 2768 bytes.
Here are the results:
Script started on Fri Apr 10 22:16:58 1981
Reval: lisp
Franz Lisp, Opus 34
-> (load 'pairs)
[fasl pairs.o]
t
-> (test)
2592(runtime 7.033333333333333)(gctime 23.53333333333333)
nil
-> (test)
2592(runtime 7.383333333333333)(gctime 4.816666666666667)
nil
-> (test)
2592(runtime 7.283333333333333)(gctime 4.366666666666667)
nil
-> (test)
2592(runtime 7.333333333333333)(gctime 4.666666666666667)
nil
-> (exit)
------
I looked at which functions were being called, and it seems just about all this
benchmark does is call 'member' 10,000 times. I noticed that in our system
'memq' would do as good a job as 'member', so I replaced member by memq
and ran it with these results (a sketch of the substitution follows the script output):
Reval: lisp
Franz Lisp, Opus 34
-> (load 'memqpairs)
[fasl memqpairs.o]
t
-> (test)
2592(runtime 1.683333333333333)(gctime 23.55)
nil
-> (test)
2592(runtime 1.733333333333333)(gctime 4.833333333333333)
nil
-> (test)
2592(runtime 1.766666666666667)(gctime 4.35)
nil
-> (test)
2592(runtime 1.783333333333333)(gctime 4.7)
nil
-> (exit)
script done on Fri Apr 10 22:21:50 1981
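One way the substitution above could have been made (the change may equally
well have been edited into the source by hand) is to shadow member with a
macro that expands into memq, which, as noted above, does as good a job here.
This affects every member call compiled after the macro is loaded:

(defmacro member (x l) `(memq ,x ,l))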
SCCPP Texas
∂07-Apr-81 2213 Mabry Tyson <ATP.Tyson at UTEXAS-20> SCCPP on UCI-Lisp
Date: 8 Apr 1981 0001-CST
From: Mabry Tyson <ATP.Tyson at UTEXAS-20>
Subject: SCCPP on UCI-Lisp
To: rpg at SU-AI
Results of LISP timing tests for UCI-Lisp
(Times are in the form R+G where R is the runtime (not including GC time)
and G is the GC time. All times are in seconds. "Interp" means interpreted,
"Slow" means compiled but using UUO-links, "Fast" means compiled with the UUO
links replaced by direct jumps.)
                                  Processor
Program      KL-2060                              KI-1060
             Interp      Slow       Fast          Interp      Slow         Fast
SCCPP:
 Free:
  30000      14.00+2.78  5.32+2.38  3.38+2.78     53.43+8.58  21.14+11.62  12.77+11.56
             14.04+2.83  5.37+2.70  3.35+2.71                 20.40+11.47  12.63+11.36
                                    3.34+2.70                 20.72+11.50  12.71+11.44
  60000      14.60+0.58  5.40+0.50  3.35+0.53     52.80+1.42  21.18+1.45   12.81+1.39
             14.20+0.64  5.44+0.52  3.36+0.53                 21.27+1.38   12.34+1.43
             14.09+0.63  5.37+0.52  3.35+0.52                 21.19+1.40   12.93+1.40
             14.22+0.61  5.35+0.52  3.40+0.53
Notes: The functions with more than 5 arguments were changed to 5-argument
functions by making the last 3 into a list (see the sketch below). These were
involved in less than 3% of the run time, so the difference is insignificant.
The timings on the 2060 were with a load average less than 1. The timings on
the KI-10 were with a moderate load. The differences in the various timing
runs are probably due to system overhead from swapping. The 30K free space
examples had about 5 or 6 GC's while the 60K free space examples had one GC each.
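A sketch of the rewrite described in the notes (parameter names invented, as
in the earlier sketch): the last three arguments travel as a single list,
keeping the compiled function at 5 arguments.

;; Before:  (DEFUN PAIRS (X Y SOFAR FN D1 D2 D3) ...)
;;          called as (PAIRS A B () 'EQUAL () () ())
;; After:
(DEFUN PAIRS (X Y SOFAR FN REST)
  ((LAMBDA (D1 D2 D3)
     (LIST X Y SOFAR FN D1 D2 D3))    ; stand-in for the original body
   (CAR REST) (CADR REST) (CADDR REST)))
;; with call sites changed to (PAIRS A B () 'EQUAL (LIST () () ())).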
-------
SCCPP Multics
;;; compiled
(test)
2592.
(runtime 3.690232)
(gctime 2.478373)
(test)
2592.
(runtime 3.679003)
(gctime 3.930743)
(test)
2592.
(runtime 3.693353)
(gctime 2.650682)
;;; UTLISP 5.1
Following are the timings for SCCPP for UTLISP 5.1. Because of local
interactive field length (= core size) restrictions, all runs were
submitted as batch jobs. "runtime" does not include "gctime".
Interpreted:
  run#:       #1      #2      #3      #4      #5
  runtime:    18.15   18.17   18.25   18.22   18.26
  gctime:      1.34    1.37    1.34    1.35    1.34
Run at 200000 (octal) field length with: 45300. words of free space
                                          3030. words of full space
There were three garbage collections for each run.
Compiled:
  run#:       #1      #2      #3      #4      #5
  runtime:     3.95    4.09    4.06    4.10    4.03
  gctime:       .97     .82     .84     .82     .83
Run at 20000 (octal) field length with: 45524. words of free space
                                         2957. words of full space
There were three garbage collections for each run.
-------
;;; LM-2
;; 1. SCCPP
(DEFUN TEST-SCCPP ()
  (LENGTH (TIMING "SCCPP"
                  (PAIRS A B () 'EQUAL () () ()))))
;; Compiled: 9.3 seconds.
;; Interpreted: 116.3 seconds.
;; This test seems to suffer from an incredibly cretinously written NCONC.
;; After adding an NCONC optimizer to the compiler (which will exist in System 211):
;; Compiled: 7.9 seconds.
∂04-Aug-82 1052 HIC at SCRC-TENEX [DLA: forwarded]
Date: Wednesday, 4 August 1982 12:26-EDT
From: HIC at SCRC-TENEX
To: rpg at sail
Subject: [DLA: forwarded]
Date: Saturday, 31 July 1982, 09:46-EDT
From: David L. Andre <DLA>
To: HIC
cc: DLA
While I was editing UCADR, I decided to microcode a two-argument NCONC.
With this microcoded NCONC, the SCCPP benchmark radically improved:
;; Old results:
;; Compiled: 9.3 seconds.
;; Interpreted: 116.3 seconds.
;; This test seems to suffer from an incredibly cretinously written NCONC.
;; After adding an NCONC optimizer to the compiler (which will exist in System 211):
;; Compiled: 7.9 seconds.
;; With microcoded NCONC (System 211, UCADR 920)
;; Compiled: 6.0 seconds.
You may wish to forward this to the appropriate people.
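For reference, the operation that was microcoded is just the two-argument
case of NCONC. A Lisp-level rendering of it (purely to show what the
microcode implements; the UCADR version is of course not written this way,
and the name NCONC2 is made up here):

(DEFUN NCONC2 (X Y)
  (COND ((NULL X) Y)
        (T (RPLACD (LAST X) Y)      ; splice Y onto the last cons of X
           X)))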
;;; InterLisp on SCORE
∂30-Sep-82 1218 James Bennett <csd.Bennett at SU-SCORE> new timings
Date: 30 Sep 1982 1207-PDT
From: James Bennett <csd.Bennett at SU-SCORE>
Subject: new timings
To: rpg at SU-AI
Stanford-Phone: (415) 497-2225
Home-Phone: (415) 322-2233
Dick,
Here they are:
6←(FLENGTH (TIME (PAIRS A B NIL 'EQUAL) 1 0]
collecting lists
3234, 10402 free cells
collecting lists
2842, 10010 free cells
collecting lists
3984, 10128 free cells
collecting lists
4052, 10196 free cells
collecting lists
16161, 16161 free cells
51493 conses
4.845 seconds
22.672 seconds, real time
2592
7←USE 4 FOR 1 IN 6
collecting lists
25873, 25873 free cells
collecting lists
9746, 10258 free cells
collecting lists
15907, 15907 free cells
collecting lists
6358, 10454 free cells
collecting lists
3642, 10298 free cells
collecting lists
4118, 10262 free cells
collecting lists
5956, 10052 free cells
collecting lists
35319, 35319 free cells
collecting lists
11874, 11874 free cells
collecting lists
38897, 38897 free cells
collecting lists
12308, 12308 free cells
205972/4 = 51493 conses
18.741/4 = 4.68525 seconds
200.712 seconds, real time
2592
10←DRIBBLE]
-------